ABSTRACT 

A new class of geometric statistics for analyzing galaxy catalogs is presented. Filament 
statistics quantify filamentarity and planarity in large scale structure in a manner 
consistent with catalog visualizations. These statistics are based on sequences of spatial 
links which follow local high-density structures. From these link sequences we compute 
the discrete curvature, planarity, and torsion. Filament statistics are applied to CDM 
and CHDM ($\Omega_\nu = 0.3$) simulations of Klypin et al. (1996), the CfA1-like mock redshift 
catalogs of Nolthenius, Klypin and Primack (1994, 1996), and the CfA1 catalog. We 
also apply the moment-based shape statistics developed by Babul & Starkman (1992), 
Luo & Vishniac (1995), and Robinson & Albrecht (1996) to these same catalogs, and 
compare their robustness and discriminatory power versus filament statistics. For 100 
Mpc periodic simulation boxes ($H_0 = 50$ km s$^{-1}$ Mpc$^{-1}$), we find discrimination of 
~ 4$\sigma$ (where $\sigma$ represents resampling errors) between CHDM and CDM for selected 
filament statistics and shape statistics, including variations in the galaxy identification 
scheme. Comparing the CfA1 data versus the models does not yield a conclusively 
favored model; no model is excluded at more than a ~ 2$\sigma$ level for any statistic, 
not including cosmic variance which could further degrade the discriminatory power. 
We find that CfA1 discriminates between models poorly mainly due to its sparseness 
and small number of galaxies, not due to redshift distortion, magnitude limiting, or 
geometrical effects. We anticipate that the proliferation of large redshift surveys and 
simulations will enable the statistics presented here to provide robust discrimination 
between large-scale structure in various cosmological models. 

Key words: large-scale structure of the Universe — dark matter — cosmology: 
theory — methods: numerical — methods: data analysis 



1 INTRODUCTION 

In this paper we develop and apply statistics to quantita- 
tively characterize the shapes of galaxy distributions seen 
in redshift surveys. We then use these statistics to compare 
cosmological simulations of pure Cold Dark Matter (CDM) 
models versus Cold plus Hot Dark Matter (CHDM) models 
in real space, as well as simulated CfA1-like redshift surveys 
generated from these simulations versus the CfA1 data. The 
four simulations used here are summarized in Table 1 and 
described in detail in §4.1. The availability of this suite of 
simulations of different cosmological models, all computed 
and analyzed in parallel, allows us to test the ability of these 
statistics to discriminate between such models. Visual comparison 
of the simulations (Brodbeck et al. 1996; hereafter 
BHNPK) shows that the CDM galaxy distribution contains 
larger clusters and less well-defined filamentary and sheet- 
like structures than CHDM, consistent with the fact that 
CHDM forms structure at a later epoch than CDM. The 
statistics presented here confirm as well as quantify these 
results, showing statistically significant and robust discrim- 
ination between the models. 

Ever since the CfA1 Survey (de Lapparent, Geller & 
Huchra 1988) detected filamentary and planar structures in 
the galaxy distribution, many attempts have been made to 
develop statistics computed solely from the redshift-space 
positions of galaxies which quantify these large-scale struc- 
tures. It became apparent that the two-point correlation 
function contains very limited information about structure, 
while higher-order correlation functions are difficult to measure 
and mainly describe the highest density regions. Thus 
alternative, more geometrical methods have been developed 
which contain information from all orders of correlation 
functions in such a way as to characterize the types of structures 
seen in the surveys. The void probability function (e.g. 
Vogeley, Geller & Huchra 1991, Ghigna et al. 1994, 1996) 
and the topological genus statistic (see Melott 1990 for re- 
view, and Coles, Davies & Russell 1996 for related discus- 
sion) have had some success, and lately more complex statis- 
tics have been developed which seem promising such as the 
Minkowski Functionals (Mecke, Buchert & Wagner 1994 and 
Kerscher et al. 1996). 

Here we introduce filament statistics, a new class of geo- 
metric statistics designed to quantify filamentarity and pla- 
narity in large-scale structure. Filament statistics use infor- 
mation about the moments of the local mass distribution to 
characterize the shape of large-scale structure. In this way 
they are similar to the shape statistics of Babul & Starkman 
(1992; hereafter BS), Luo & Vishniac (1995; hereafter LV), 
and Robinson & Albrecht (1996; hereafter RA). However, 
the way in which the moment tensor information is used 
in filament statistics is fundamentally different from these 
shape statistics. Rather than randomly sampling the galaxy 
distribution, filament statistics use a prescription to map 
the galaxy distribution into a new set of points which ampli- 
fies the properties showing the greatest differences between 
models, namely filamentarity and planarity. While statistics 
applied to the new point set can be more discriminatory, 
care must be taken to develop a prescription which is robust 
against inherent variations in the simulated galaxy distribu- 
tion such as galaxy identification uncertainty and cosmic 
variance. Hence in constructing filament statistics, we are 
guided by the following principles: 

• It is best to attempt to directly quantify structures 
which visually show the greatest differences between models, 
i.e. filaments and sheets. 

• It is best to apply statistics directly to the point set 
of galaxies rather than a smoothed density distribution to 
avoid discarding information on small scales. 

• It is best to construct simple, interpretable statistics, 
in order to more easily understand their robustness against 
intrinsic uncertainties in the analysis. 

We find that both our filament statistics and the BS, 
LV and RA shape statistics yield statistically significant dis- 
crimination between CDM and CHDM simulations, which 
persists (though to a significantly lesser degree) even in a 
redshift-space comparison versus the CfA1 data. A more 
informative comparison of these statistics must await the 
availability of larger, more complete redshift surveys as well 
as simulations capable of properly modeling these large vol- 
umes of space; both should be available soon. For now, we 
present these results to demonstrate the viability of the 
methods. We note that this is the first publication in which 
any of these moment-based shape statistics have been used 
to compare simulations to redshift survey data, which is the 
purpose for which they were originally devised. 



2 IMPLEMENTATION OF FILAMENT 
STATISTICS 



2.1 Dekel's Alignment Statistic 

Our filament statistics are related to the alignment statistic, 
originally proposed by Dekel (1984): For each galaxy, con- 
sider two concentric shells; find the moment of inertia ellip- 
soid axes defined by galaxies within each shell; and calculate 
the angle difference between the inertia tensor axes. Presum- 
ably, where the angle difference in the major axis is small, 
there is a filamentary structure present, and where the an- 
gle difference in the minor axis is small, there is a sheet-like 
structure present. By randomly sampling the galaxy distri- 
bution at different shell radii, one can then gain a measure 
of the filamentarity and planarity in large-scale structure at 
various scale lengths. However, we found that the alignment 
statistic barely discriminated between CDM and CHDM in 
real 3D space, and failed to discriminate the models in red- 
shift space. 

Since the visualizations of BHNPK show marked dif- 
ferences in the number, size, and continuity of filamentary 
structures in CHDM and CDM, we were inspired to consider 
mapping the point set of galaxies into another point set by 
an algorithm that sensitively favors contiguous high density 
regions. 

2.2 The Creation of Link Sequences 

The basis of filament statistics is the creation of link se- 
quences which follow along local high density regions, as 
determined by the principal axis of the local moment of in- 
ertia tensor. Link sequences may be thought of as a tech- 
nique to map the point set of galaxies into a new point set 
which emphasizes the higher density regions containing the 
filamentary and planar structures which we would like to 
quantify. A statistic applied to this new point set is likely to 
be more discriminatory than the same statistic applied to a 
random subset of galaxies; this is the case with the align- 
ment statistic. Most other statistics presented in the litera- 
ture (including BS, LV, and RA statistics) use sampling of 
the data points to obtain a measure of the global structure; 
filament statistics represent a new method for manipulating 
the data to enhance the structures of interest, thereby 
enhancing the discriminatory power of a given statistic. 

A link sequence is an ordered set of points which can be 
visualized as joined by "links", created by the procedure outlined 
in the accompanying flowchart figure. A link sequence is started 
from each galaxy in a catalog of galaxies (or if there are too 
many, a random subset of such galaxies). The moment of 
inertia tensor is computed using the masses and positions 
of galaxies within a range R of the given point; for redshift 
survey data, we weight by luminosity instead of mass. The 
eigenvectors and eigenvalues of the inertia tensor are found, 
and from these the principal axis is determined. The new 
point in the sequence is created at a distance L (the "link 
length") away in the direction of the principal axis, and a 
link is created which joins the old point to the new point. 
Note that only the first point in a link sequence is a galaxy; 
the others are simply locations within the catalog volume. 
A new inertia tensor is computed around this new point, 
and the procedure is repeated until termination. Sequence 
termination occurs when there are too few nearby galaxies 
to reasonably identify an axis. By this prescription, each 
galaxy generates a sequence of links. If a sequence has more 
than $N_{L,\min}$ links, then statistics are computed on this link 
sequence, otherwise the sequence is discarded. The construc- 
tion of a link sequence is completely defined by choosing the 
link length L, the maximum radius of galaxies to be included 
in computation of the moment tensor R, and the criteria for 
termination of a sequence. 

Note that the principal axis of the inertia tensor does 
not define a unique direction. So from the initial point, the 
sequence is propagated in both (opposing) directions until 
termination, and the entire joined sequence is what is used 
for statistical analysis, as long as the total number of links 
is at least $N_{L,\min}$. Generally, sequences tended to be non-intersecting, 
but in some cases they oscillated between two 
points. When this is detected, the sequence is terminated. 
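
The construction described in this section can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the code used to produce the results: galaxies are given unit mass, the tensor is the second-moment tensor about the centroid of the galaxies inside a top-hat window, the principal axis is found by power iteration, the sign of each new axis is chosen to continue in the direction of the previous link, and the oscillation check is omitted. The function names and the per-direction cap `max_links` are hypothetical.

```python
import math

def second_moment(points, center, R):
    """Second-moment tensor (about the centroid) of unit-mass points within R of center."""
    near = [p for p in points if math.dist(p, center) <= R]
    n = len(near)
    if n == 0:
        return None, 0
    mean = [sum(p[i] for p in near) / n for i in range(3)]
    T = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in near) / n
          for j in range(3)] for i in range(3)]
    return T, n

def principal_axis(T, iters=200):
    """Unit eigenvector of the largest eigenvalue of T, by power iteration."""
    v = [1.0, 0.7, 0.4]  # arbitrary starting vector
    for _ in range(iters):
        w = [sum(T[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def grow(points, start, direction, L, R, n_pt_min, max_links):
    """Propagate from start, one link of length L at a time, until termination."""
    nodes, pos, prev = [], list(start), direction
    for _ in range(max_links):
        T, n = second_moment(points, pos, R)
        if n < n_pt_min:  # too few nearby galaxies to identify an axis
            break
        axis = principal_axis(T)
        if sum(a * b for a, b in zip(axis, prev)) < 0:
            axis = [-a for a in axis]  # an axis has no sign: keep heading the same way
        pos = [pos[i] + L * axis[i] for i in range(3)]
        nodes.append(pos)
        prev = axis
    return nodes

def link_sequence(points, start, L, R, n_pt_min=5, max_links=34):
    """Nodes of the joined sequence grown in both directions from a starting galaxy."""
    T, n = second_moment(points, start, R)
    if n < n_pt_min:
        return [list(start)]
    axis = principal_axis(T)
    forward = grow(points, start, axis, L, R, n_pt_min, max_links)
    backward = grow(points, start, [-a for a in axis], L, R, n_pt_min, max_links)
    return backward[::-1] + [list(start)] + forward
```

A sequence returned by `link_sequence` would then be kept only if its number of links (nodes minus one) reaches the minimum required for analysis.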



2.3 Constructing a Dimensionless Statistic 

We would like to construct dimensionless parameters which 
describe the shapes of structures. For that we need to ex- 
press all scales in units of some typical length scale of the 
catalog. A natural choice is the mean intergalaxy spacing 
$d = (V/N)^{1/3}$, where $V$ is catalog volume and $N$ is number 
of galaxies in the catalog, since it provides a length scale in- 
dependent of galaxy clustering; it is also the simplest choice. 

We will consider applications in real space as well as 
magnitude-limited and volume-limited redshift space. Red- 
shift distortion will produce measurable effects on link se- 
quences, and attempts will be made to understand and quan- 
tify these effects. Whereas in real space the mapping of a 
galaxy into a link sequence is completely well-defined, in 
redshift space this is no longer true — redshift distortion 
for a given structure depends on the vantage point chosen 
to observe the structure. In this paper we introduce a new 
method for quantifying how redshift-space distortion affects 
these statistics. We show that none of the statistics ana- 
lyzed here are adversely affected by redshift distortion to 
any significant degree. 

A complication arises in computing d in magnitude- 
limited catalogs, since the sample incompleteness increases 
with distance from the Milky Way origin, making d a func- 
tion of radius from origin. A local computation of d around 
a given sequence point (i.e., using $d = (V/N)^{1/3}$ for a local 
volume V around the given point) will degrade the statistics, 
since structure identification will be biased towards under- 
dense regions where d is large, which is exactly opposite of 
what is desired. Instead, d should be corrected using the se- 
lection function, which depends only on the distance r from 
the origin. Since $\phi(L)\,dL$ is the number density of galaxies 
between luminosity L and L + dL, we can obtain d(r) for 
galaxies visible above the magnitude limit as follows: 



$$ d(r) = \left[ \int_{L_{\rm lim}(r)}^{\infty} \phi(L)\,dL \right]^{-1/3} \qquad (1) $$



where $L_{\rm lim}(r)$ is the luminosity of a galaxy with apparent 
magnitude 14.5 (the CfA1 magnitude limit) at a distance $r$. 
$\phi(L)$ is assumed to have the Schechter form 



$$ \phi(L)\,dL = \phi^* \left(\frac{L}{L^*}\right)^{\alpha} \exp(-L/L^*)\,\frac{dL}{L^*}. \qquad (2) $$



The Schechter parameters $\phi^*$, $L^*$ and $\alpha$ were best-fit to each 
real and simulated redshift catalog individually; this procedure 
is described in Nolthenius et al. (1994, 1996; NKP94 
and NKP96, respectively). Note that the true distance $r$ is 
unknown, and is instead estimated assuming no peculiar velocities, 
i.e. $r = v/H_0$ for a galaxy with radial velocity $v$. In 
the CfA1 data, a few blueshifted galaxies (mostly in Virgo) 
do end up on the opposite side of the origin, but the statistics 
turn out to be insensitive to where these few galaxies 
are placed. $d(r)$ is computed and used as the local mean intergalactic 
spacing at each sequence point in the analysis of 
magnitude-limited catalogs. 
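
As an illustration of how $d(r)$ follows from the selection function, the sketch below integrates a Schechter function numerically above the limiting luminosity and takes the inverse cube root of the resulting number density. The default values of `phi_star`, `alpha` and `M_star` are placeholders chosen only for illustration (the fitted values are catalog-specific and are not reproduced here); the function names are hypothetical, and distances are assumed to be in Mpc.

```python
import math

def schechter_number_density(x_lim, phi_star, alpha, n=4000, span=40.0):
    """phi_star * integral of x^alpha e^-x dx from x_lim to x_lim + span, with x = L/L*.

    Simpson's rule in u = ln x; the tail beyond x_lim + span is negligible.
    """
    u0, u1 = math.log(x_lim), math.log(x_lim + span)
    h = (u1 - u0) / n

    def f(u):
        x = math.exp(u)
        return x ** (alpha + 1.0) * math.exp(-x)  # extra factor x from dx = x du

    total = f(u0) + f(u1)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(u0 + k * h)
    return phi_star * total * h / 3.0

def l_lim_over_l_star(r_mpc, m_lim=14.5, M_star=-20.9):
    """L_lim(r)/L* for a galaxy at the apparent-magnitude limit at distance r (Mpc)."""
    M_lim = m_lim - 25.0 - 5.0 * math.log10(r_mpc)  # distance modulus
    return 10.0 ** (0.4 * (M_star - M_lim))

def mean_spacing(r_mpc, phi_star=0.002, alpha=-1.1, m_lim=14.5, M_star=-20.9):
    """Mean spacing d(r) of galaxies visible above the magnitude limit."""
    x_lim = l_lim_over_l_star(r_mpc, m_lim, M_star)
    return schechter_number_density(x_lim, phi_star, alpha) ** (-1.0 / 3.0)
```

With any reasonable parameter choice `mean_spacing` grows with `r_mpc`, which is the behavior the selection-function correction is meant to capture.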



2.4 Link Parameters 

The first parameter choice we tried was the simplest, with 
$L = R = d$. The virtue of this definition is that we have a 
parameter-free statistic, in the sense that the parameters are 
all determined from intrinsic properties of the data set. Unfortunately, 
statistics derived from constructions with these 
"natural" parameters did not discriminate between models. 
$R = d$ turns out to be too small to identify a local structure, 
and is dominated by shot noise. 

For link length $L = d$, but range $R$ left as a free parameter, 
we obtain discriminatory statistics; this choice of 
$L$ appears to work as well as any other. However, for $R = d$, 
and $L$ a free parameter, we again find little discrimination 
between models, or even from a Poisson catalog. A larger 
R will yield more points per sphere, thereby lowering shot- 
noise scatter. Since the R parameter controls the scales of 
structure being measured by the statistics, it is interesting 
and instructive to look at statistics as a function of R, and 
the results will be presented that way. 



2.5 Termination Criteria 

There are three parameters which set the termination criteria 
for a link sequence. $N_{pt,\min}$ is the minimum number of 
galaxies required within a sphere of radius $R$ for a sequence 
to continue; $N_{pt,\min}$ was set to 5 so that the determination of 
the principal axis would be statistically meaningful, and so 
that a sequence would terminate if it was in a sparse region 
in the catalog. $N_{L,\max}$ sets the maximum number of links 
for a periodic catalog, and is set so that the total length of 
a sequence cannot exceed the length of the simulation box. 
In a redshift survey, the sequence terminates if it exceeds 
the catalog boundary. $N_{L,\min}$ sets the minimum number of 
links for a sequence to be statistically meaningful. This was 
set to 4 links (the minimum value for computation of all 
statistics), but can be increased to explore more extended 
structures. However, since each link is typically fairly large 
(≈3 Mpc in the simulations considered, and ≈15 Mpc in the 
sparser CfA1 catalog), 4 links is already exploring a reasonably 
extended scale. 

All the termination parameters were varied over fairly 
wide ranges. $N_{pt,\min}$ was varied from 4 to 10 with little 
change in discrimination or robustness; any higher, and the 
shot noise generated from fewer sequences became significant. 
The statistics are independent of $N_{L,\max}$ as long as it 
is above about 10, below which shot noise from the small 
number of links becomes significant; it is ~ 34 in the periodic 
simulation boxes. Variations in $N_{L,\min}$ had some effect 
on the results for real and simulated redshift catalogs, since 
for very small values (2 or 3) shot noise increases from low 
link sampling, while for a high value (above 10), the number 
of acceptable sequences decreases so that catalog sampling 
shot noise becomes large. Discrimination was also insensitive 
to the choice of either a Gaussian, exponential, or top 
hat window function; we used a top hat for computational 
efficiency. 



2.6 Computation of Statistics 

Once link sequences have been generated, there are two ways 
the new point set may be used. We may choose to apply 
statistics which were previously applied to random subsets 
of the data to this new point set. Alternatively, we may de- 
vise statistics which measure the properties of link sequences 
themselves. Filament statistics are based on the latter idea, 
following the intuitive characterizations of structure given 
by the alignment statistic. 

We developed three statistics to compute on a link se- 
quence which measure filamentarity or planarity in an easily 
interpretable way. We call them planarity, curvature, and 
torsion. These statistics are in general defined as angle devi- 
ations between inertia ellipsoid axes for consecutive points 
along a link sequence; the exact definitions are as follows: 

• Planarity ($\theta_P$) is the angle difference between the minor 
axis of the inertia tensor for two consecutive points. The 
geometrical interpretation of planarity is as follows: Given 
that filaments in large-scale structure often occur at intersections 
of sheet-like structures, the minor axis of the inertia 
tensor along the filament measures the strength of the embedding 
sheet perpendicular to the filament; hence a lower 
planarity angle indicates the presence of a local sheet-like 
structure. 

• Curvature ($\theta_C$) is defined as the angle difference between 
two consecutive links. Equivalently, it is the angle 
difference between the major axis of the inertia tensor for 
two consecutive points. A sequence which is following a well-defined 
filament will have a low angle difference between 
links; hence a lower curvature angle indicates greater filamentarity. 

• Torsion ($\theta_T$) is the angle difference between the plane 
defined by the first two links and the third link. Torsion 
measures the strength of the embedding sheet parallel to 
the filament, a lower torsion indicating a stronger planar 
structure present. 

In all cases, a lower value (angle difference) signifies 
more structure present in the catalog. As an example, con- 
sider a set of points distributed randomly throughout a long, 
thin cylinder. A sequence will track the cylinder, and the 
angle deviation between each successive link will be very 
small; hence curvature will show a very low angle deviation. 
Conversely, planarity and torsion will show large angle 
deviations since there is no locally preferred plane in a circular 
cylinder. For a thin sheet, sequences will randomly 
walk throughout the sheet, yielding a high curvature angle 
(indicating no filamentary structure), but low planarity and 
torsion angles (indicating lots of planar structure). 
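
The three angle measures can be written down directly from the node positions and minor axes of a link sequence. The following is a minimal sketch under stated assumptions: the function names are hypothetical, axes are treated as undirected (so axis angles are folded into 0 to 90 degrees), and torsion is returned as `None` when the first two links are collinear and define no plane, a case the text does not specify.

```python
import math

def _angle_deg(u, v, undirected=False):
    """Angle between vectors u and v in degrees; folded to [0, 90] if undirected."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    c = dot / (norm_u * norm_v)
    if undirected:
        c = abs(c)
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))

def _sub(p, q):
    return [a - b for a, b in zip(p, q)]

def _cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def curvature(p0, p1, p2):
    """Angle between two consecutive links p0->p1 and p1->p2."""
    return _angle_deg(_sub(p1, p0), _sub(p2, p1))

def planarity(minor_a, minor_b):
    """Angle between the minor axes found at two consecutive points."""
    return _angle_deg(minor_a, minor_b, undirected=True)

def torsion(p0, p1, p2, p3):
    """Angle between the plane of the first two links and the third link."""
    normal = _cross(_sub(p1, p0), _sub(p2, p1))
    if all(abs(c) < 1e-12 for c in normal):
        return None  # collinear links: plane undefined (assumed convention)
    # angle to the plane = 90 degrees minus the angle to the plane's normal
    return 90.0 - _angle_deg(normal, _sub(p3, p2), undirected=True)
```

For example, three nodes on a straight line give zero curvature, while a right-angle turn gives 90 degrees.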

In large-scale structure, filaments are often embedded 
within sheets, and thus these statistics are expected to be 
correlated. Nevertheless it is useful to consider each one separately. 
A key difference between the statistics is that each 
requires a different number of sequence points to compute. 
Planarity is the most local statistic, being computed from 
only 2 link nodes, while curvature requires 3, and torsion 
requires 4. While planarity and torsion are in the ideal case 
purely measures of planarity, torsion is more sensitive to the 
presence of local filamentary structure since it measures an- 
gle differences along the sequence rather than perpendicular 
to the sequence. 

For each of these statistics, an average value is found 
within a single sequence. Then, for all the sequences in that 
catalog, a median value is found. We will denote the resulting 
averaged-then-medianed statistic by a bar, as in $\bar\theta_C$. 
This final median value is the value of that statistic for the 
given catalog at the selected value of $R$. Error analysis is 
discussed in section 4. 
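
The averaging-then-median step amounts to the following short sketch. The function name and the input layout (one list of angle values per qualifying sequence) are assumptions made for illustration.

```python
from statistics import mean, median

def catalog_statistic(angles_per_sequence):
    """Median over sequences of the per-sequence mean angle."""
    per_sequence_means = [mean(a) for a in angles_per_sequence if len(a) > 0]
    return median(per_sequence_means)
```

For instance, three sequences with angles `[10, 20]`, `[30]` and `[0, 0, 30]` have means 15, 30 and 10, so the catalog value is 15.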

2.7 Visualization and Algorithm Testing 

We have attempted to construct an algorithm which will 
identify and track filaments. We tested the algorithm on ar- 
tificially generated point sets of lines and planes of varying 
thickness. The results conformed to qualitative expectations, 
that lines should show a great deal of filamentarity and little 
planarity, and vice versa for planes. Also, the median angle 
deviations increased with thickness, as expected. Visualiza- 
tions showed that link sequences were tracking the structure 
as expected. 

When we visualized the link sequences which were gen- 
erated in an actual CHDM simulation, they tended to lie 
preferentially in regions of structure, but could not often be 
associated with visually recognizable filaments. They were 
also scattered throughout the simulation volume. This is 
because for the simulations we considered (which will be 
described in the next section), nearly every galaxy that 
was tried as a sequence starting point yielded a qualifying 
(NL,min > 4) sequence. Thus the parameter set we have 
chosen does not sufficiently restrict the generated sequences 
to lie directly along the filaments that are detected by eye. 
By imposing more severe requirements for sequence qual- 
ification, one can tune the algorithm to better recognize 
filamentary patterns. However, this reduces the number of 
qualifying sequences to a point where statistics are poor, 
and hence it is not useful for performing statistically signif- 
icant comparisons. Our conclusion is that this algorithm is 
not particularly suited for pattern recognition, and is better 
suited for statistical comparison of overall structural proper- 
ties of models. The statistics we compute have simple inter- 
pretations, and the results for various models are consistent 
with the BHNPK visualizations; however, this agreement is 
not necessarily apparent from visualizations of individual 
link sequences. 

Little effort went into developing analytical predictions 
for expected values of $\bar\theta_C$, $\bar\theta_P$, and $\bar\theta_T$, even in the case of a 
Poisson catalog. This is due primarily to the fact that the algorithm 
was successful in the test cases we considered, and 
thus a complex and time-consuming analytical prediction 
was deemed to be low priority. Further numerical testing 
may also be done by superimposing lines or sheets of varying 
strengths on a Poisson catalog, and determining how 
effectively the algorithm identifies structure. We leave these 
endeavors to the future, and instead for now concentrate on 
applications to the comparison of cosmological models. 



3 MOMENT-BASED SHAPE STATISTICS 

As a brief review, we present the definitions of statistics 
given by BS, LV, and RA. Since one of the authors indepen- 
dently derived the LV statistics (Hellinger 1995), we present 
those first and in somewhat more detail. The construction 
of each family of statistics is well-motivated and elegantly 
presented in the relevant papers; we refer the reader to them 
for further details, and here focus on the definitions. 

3.1 Luo & Vishniac Shape Statistics 

The statistics presented in LV are a three-dimensional ex- 
tension of the two-dimensional shape statistics devised by 
Vishniac (1986). In three dimensions, LV statistics are given 
by a linear combination of the quadratic coordinate moment 
invariants (summations implied by repeated indices): 

$$ LV = C_1 (M^{ii})^2 + C_2 (M^{ij})^2 + C_3 (M^{ii})(M^j)^2 - C_4 (M^{ij})(M^i)(M^j), \qquad (3) $$

where 

$$ M^i = \frac{1}{M} \sum_k m_k (x_k^i - x_0^i), $$

$$ M^{ij} = \frac{1}{M} \sum_k m_k (x_k^i - x_0^i)(x_k^j - x_0^j), $$

$$ M = \sum_k m_k \qquad (4) $$

are summed over all galaxies at positions $x_k$ with masses $m_k$ within 
a window radius $R$ of a central galaxy at $x_0$. The constants 
$C_i$ are determined by the constraints applicable for a given 
shape. For a "filamentarity" statistic we have 

LV = 0 for a spherical distribution entirely within $R$, 
LV = 0 for a distribution with a uniform gradient, 
of arbitrary size, in the window defined by $R$, 
LV = 1 for a uniform linear density passing 
through the window center. 

This yields the quadratic filamentarity statistic: 

+ l(M u ){M j ) 2 - fCM^XA^CM')} (5) 

appropriate for comparison of three dimensional real and 
redshift data. 

The quadratic statistic has the virtue of being a lowest 
order nontrivial moment invariant shape statistic, and thus 
can generally be expected to yield the strongest signals, but 
it correspondingly has the weakest ability to discriminate 
among different clustering shapes. For example, consider a 
data set where all of the galaxies are coplanar. Here we find 
that $LV_{\rm quad} = 1/4$, indicating that a purely planar structure 
can imprint a weak signal; thus the quadratic statistic 
cannot fully distinguish lines from planes. Either we need to 
supplement this statistic with a complementary diagnostic, 
or we must sacrifice some signal strength and go to a higher 



order statistic. We follow LV and choose the latter method. 
This gives the following LV cubic structure statistics: 

+ ^M"(M kk ~{M 3 ) 2 ) 

+3M l3 (M jk - M 3 M k )(M lk ~ M l M k ) 

~M ij (M" - M k M k )(M' 3 - M'M 3 )} (6) 

LVpizne = jj^- ¥ {+4M^M' 3 ^M'M 3 ) 2 

-AM"(M kk - {M 3 ) 2 ) 

-l2M 13 (M 3k - M 3 M k )(M lk - M r M k ) 

+12M ij (M u - M k M k )(M 13 - M l M 3 )} (7) 

properly discriminating linear from planar structures. We 
also considered "flatness", an equally weighted combination 
of linearity and planarity: $LV_{\rm flat} = \frac{1}{2}(LV_{\rm line} + LV_{\rm plane})$. 
We thought the flatness statistic might be useful given that 
CHDM models tend to show both higher planarity and filamentarity 
than CDM models, but in the final analysis it was 
quite similar to $LV_{\rm plane}$, so we do not consider it separately 
here. 

3.2 Babul & Starkman Shape Statistics 

The shape statistics presented in BS are derived from functions 
of the moment-of-inertia tensor $I^{ij} = M^{ij} - M^i M^j$, 
where $M^{ij}$ and $M^i$ are defined in equation (4) as averages of 
coordinate moments within a window of a specified radius $R$. 
Following the scheme introduced by Vishniac (1986), they 
define three structure functions: 

$$ BS_{\rm prol} = \sin\left[\frac{\pi}{2}(1-\mu)^p\right] \quad \mbox{"PROLATENESS"} \qquad (8) $$
$$ BS_{\rm obl} = \sin\left[\frac{\pi}{2}\,a(\mu,\nu)\right] \quad \mbox{"OBLATENESS"} \qquad (9) $$
$$ BS_{\rm sph} = \sin\left[\frac{\pi}{2}\,\nu\right] \quad \mbox{"SPHERICITY"} \qquad (10) $$

where $\mu = \sqrt{I_2/I_1}$, $\nu = \sqrt{I_3/I_1}$, $I_1 \geq I_2 \geq I_3$ are the eigenvalues 
of $I^{ij}$, $p = \log 3/\log 1.5 \approx 2.71$, and $a(\mu,\nu)$ is defined 
implicitly by 

$$ \frac{\mu^2}{a^2} - \frac{\nu^2}{a^2 (1 - \alpha a^{1/3} + \beta a^{2/3})} = 1, $$

with $\alpha = [13(1+3^{1/3}) - 3^{2/3}]/16 \approx 1.854$ and $\beta = -\frac{7}{8} 9^{1/3} + \alpha 3^{1/3} \approx 
0.854$. The form of the structure functions and the values 
were chosen to give functions which are flat near the value of 
unity for a given morphology then fall to zero more sharply, 
reaching 0.5 at an axis ratio of 1:3. In our case, prolateness 
quantifies filamentarity, oblateness quantifies planarity, and 
sphericity quantifies the clumpiness of the galaxy distributions. 
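
The BS structure functions are simple to evaluate once the eigenvalues are known. The sketch below assumes the definitions in the form $\mu = \sqrt{I_2/I_1}$, $\nu = \sqrt{I_3/I_1}$ with $I_1 \geq I_2 \geq I_3$, prolateness $\sin[\frac{\pi}{2}(1-\mu)^p]$, sphericity $\sin[\frac{\pi}{2}\nu]$, and $a(\mu,\nu)$ solving $\mu^2/a^2 - \nu^2/[a^2(1-\alpha a^{1/3}+\beta a^{2/3})] = 1$; the closed forms used for $\alpha$ and $\beta$ are the ones that reproduce the quoted values 1.854 and 0.854. The function names and the use of bisection are illustrative choices, not taken from BS.

```python
import math

ALPHA = (13.0 * (1.0 + 3.0 ** (1 / 3)) - 3.0 ** (2 / 3)) / 16.0  # ~ 1.854
BETA = -7.0 / 8.0 * 9.0 ** (1 / 3) + ALPHA * 3.0 ** (1 / 3)        # ~ 0.854
P = math.log(3.0) / math.log(1.5)                                  # ~ 2.71

def _mu_nu(I1, I2, I3):
    """Axis-ratio parameters for eigenvalues ordered I1 >= I2 >= I3 >= 0."""
    return math.sqrt(I2 / I1), math.sqrt(I3 / I1)

def prolateness(I1, I2, I3):
    mu, _ = _mu_nu(I1, I2, I3)
    return math.sin(0.5 * math.pi * (1.0 - mu) ** P)

def sphericity(I1, I2, I3):
    _, nu = _mu_nu(I1, I2, I3)
    return math.sin(0.5 * math.pi * nu)

def oblateness(I1, I2, I3, iters=80):
    mu, nu = _mu_nu(I1, I2, I3)

    def g(a):
        # g(a) = mu^2 - a^2 - nu^2 / D(a); its root in [0, 1] is a(mu, nu)
        if nu == 0.0:
            return mu * mu - a * a
        s = a ** (1 / 3)
        D = 1.0 - ALPHA * s + BETA * s * s
        return -math.inf if D <= 0.0 else mu * mu - a * a - nu * nu / D

    lo, hi = 0.0, 1.0  # g decreases monotonically on this interval
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return math.sin(0.5 * math.pi * 0.5 * (lo + hi))
```

With these definitions each function equals unity for its ideal morphology and drops to 0.5 at a 1:3 axis ratio, e.g. `oblateness(9, 9, 1)` and `prolateness(9, 1, 1)` both evaluate to 0.5 (eigenvalues scale as the square of the axis lengths).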

BS found that these statistics could discriminate be- 
tween cosmological simulations having Gaussian random ini- 
tial perturbations with varying power-law indices. These 
statistics were also recently combined with a percolation 
analysis and applied to various toy models of structure for- 
mation (Sathyaprakash, Sahni & Shandarin 1996). 
3.3 Robinson & Albrecht's Statistic 



4.1 The Halo Catalogs 



In a recent paper (RA) a combination of inertia tensor eigenvalues 
was devised which yields a value of 1 for planar structures 
and 0 for filamentary structures. In RA it is called 
"flatness", but by the LV nomenclature it is actually a planarity 
measure, since it gives 0 for filaments. It is given by: 



RA = 



V3(h-I 3 )^/P 1+ Il+I§ 



If + I\ + Jf + hh + hh + hh 



(11) 



RA found that this statistic was able to distinguish 
sheet-like non-Gaussianity in various toy models of cosmic 
string wakes. 



3.4 Test Cases and Structure Aliasing 

To test our implementation and better understand the quan- 
titative behavior of these statistics, we apply them to the fol- 
lowing three test cases, each one within a 100 Mpc periodic 
box: 

• "Line" - 1000 points randomly placed along a single 
line extending across the entire volume. 

• "Plane" - 5000 points randomly placed in a single plane 
extending across the entire volume. 

• "Sphere" - 5000 points randomly placed in a spherical 
distribution of 5 Mpc radius at the center of the box. 

We compute the statistics on each of these test cases, 
with window radius $R$ = 5 and 10 Mpc. We show the results, 
along with the analytical value for a continuous distribution 
(or equivalently, the predicted value for $R \to \infty$), in Table 2. 
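
For readers who wish to reproduce test catalogs of this kind, a minimal generator is sketched below. The orientation of the line (along the x-axis through the box center), the position of the plane (z at mid-box), and uniform sampling within the ball for the "Sphere" case are assumptions; the text specifies only the point counts, the box size and the sphere radius. The function names are hypothetical.

```python
import math
import random

def make_line(n=1000, box=100.0, rng=random):
    """Points placed at random along a line spanning the box."""
    return [(rng.uniform(0.0, box), box / 2, box / 2) for _ in range(n)]

def make_plane(n=5000, box=100.0, rng=random):
    """Points placed at random in a plane spanning the box."""
    return [(rng.uniform(0.0, box), rng.uniform(0.0, box), box / 2) for _ in range(n)]

def make_sphere(n=5000, radius=5.0, box=100.0, rng=random):
    """Points placed uniformly inside a ball at the box center (rejection sampling)."""
    c = box / 2
    pts = []
    while len(pts) < n:
        x, y, z = (rng.uniform(-radius, radius) for _ in range(3))
        if x * x + y * y + z * z <= radius * radius:
            pts.append((c + x, c + y, c + z))
    return pts
```

Passing a seeded `random.Random` instance as `rng` makes the catalogs reproducible.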

In general, all computed values agree quite closely with 
the analytical value for all statistics. As R increases, the 
value approaches the predicted value, as expected. Note that 
the quadratic filamentary statistic $LV_{\rm quad} \approx 0.25$ for a plane, 
indicating (as mentioned before) that this statistic does not 
completely distinguish lines from planes. Also, note that RA 
approaches unity at a slower rate than other planarity mea- 
sures, and concurrently yields a weak signal for a discrete 
sphere. 

The deviation from the analytical value is due to dis- 
creteness effects in these test case catalogs, an effect which 
becomes even more significant in the sparser sky catalogs. 
As an example, consider a window radius so small it only 
encompasses 3 points out of the Sphere catalog. This config- 
uration will yield a strong planar signal despite the topology 
of the underlying distribution. Analogously, 3 points within 
a planar structure can yield a strong linear signal if those 
three points happened to be somewhat colinear. We call this 
effect structure aliasing, and it is primarily important in lower 
density regions, and for smaller window radii. In the simula- 
tions and redshift surveys, we would like to probe small-scale 
structure where the differences between models are greatest, 
but we are hampered by increased shot noise and structure 
aliasing. Hence we vary R to determine the optimal scale 
for discrimination, as well as to explore the behavior of the 
statistics at different scales. 
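
Structure aliasing for three points can be checked numerically: any three points are coplanar, so the second-moment tensor about their centroid has a vanishing smallest eigenvalue regardless of the parent distribution. The sketch below is illustrative only; it uses the determinant (the product of the eigenvalues) divided by the cube of the trace as a scale-free indicator, which is a choice made here and not one of the statistics of this paper.

```python
import random

def covariance(points):
    """Second-moment tensor of unit-mass points about their centroid."""
    n = len(points)
    mean = [sum(p[i] for p in points) / n for i in range(3)]
    return [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / n
             for j in range(3)] for i in range(3)]

def det3(T):
    """Determinant of a 3x3 matrix (product of its eigenvalues)."""
    return (T[0][0] * (T[1][1] * T[2][2] - T[1][2] * T[2][1])
            - T[0][1] * (T[1][0] * T[2][2] - T[1][2] * T[2][0])
            + T[0][2] * (T[1][0] * T[2][1] - T[1][1] * T[2][0]))

def flatness_indicator(points):
    """det / trace^3: zero when the points are exactly coplanar, 1/27 when isotropic."""
    T = covariance(points)
    tr = T[0][0] + T[1][1] + T[2][2]
    return det3(T) / tr ** 3

rng = random.Random(0)
cube = [(rng.random(), rng.random(), rng.random()) for _ in range(2000)]
sparse = flatness_indicator(cube[:3])   # three points: a spurious planar signal
dense = flatness_indicator(cube)        # many points: the isotropic parent shows through
```

Here `sparse` is zero to rounding error while `dense` is close to the isotropic value, even though both samples are drawn from the same uniform distribution.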



4 THE SIMULATIONS AND DATA 



All statistics were applied to the simulations described in
Klypin, Nolthenius & Primack (1996; KNP96), which are
$(100\ {\rm Mpc})^3$ particle-mesh simulations on a $512^3$ force
resolution grid. All had $\Omega = 1$ and $H_0 = 50$ km s$^{-1}$ Mpc$^{-1}$
(which will be assumed throughout). A resolution element, or cell,
is 195 kpc. The CDM simulations had $256^3$ particles, while
the CHDM simulations had $256^3$ cold particles and $2 \times 256^3$
hot particles, giving a cold particle mass of $2.9 \times 10^9 M_\odot$
and $4.1 \times 10^9 M_\odot$ for CHDM and CDM, respectively. There
were two simulations with pure CDM, one with linear bias
factor $b = 1.0$ (CDM1) and one with $b = 1.5$ (CDM1.5), and
two CHDM simulations with 10% baryons, 30% in a single
neutrino species and the rest cold dark matter.
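The quoted cell size and particle masses can be recovered from the box size and particle numbers. The following is a back-of-the-envelope sketch, not the simulation code: it assumes the standard critical density of $2.775 \times 10^{11} h^2 M_\odot$ Mpc$^{-3}$, and it assumes that in CHDM the cold particles carry the 70% of the mass that is not in neutrinos (cold dark matter plus baryons).

```python
# Critical density in units of h^2 M_sun / Mpc^3 (standard value; assumption of this sketch).
RHO_CRIT_H2 = 2.775e11

h = 0.5                    # H0 = 50 km/s/Mpc
omega = 1.0
box_mpc = 100.0            # box side in Mpc
n_grid = 512               # force-resolution cells per side
n_cold = 256 ** 3          # number of cold particles
cold_fraction_chdm = 0.7   # assumption: 60% CDM + 10% baryons carried by cold particles

cell_kpc = box_mpc * 1000.0 / n_grid
total_mass = omega * RHO_CRIT_H2 * h ** 2 * box_mpc ** 3   # M_sun in the box
m_cdm = total_mass / n_cold
m_chdm_cold = cold_fraction_chdm * total_mass / n_cold

print(f"cell size          : {cell_kpc:.1f} kpc")
print(f"CDM particle mass  : {m_cdm:.2e} M_sun")
print(f"CHDM cold particle : {m_chdm_cold:.2e} M_sun")
```

Under these assumptions the numbers agree with the values in the text to within rounding.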

Both CHDM simulations have linear bias factors which
are compatible with the COBE DMR results, while CDM1
is nearly compatible, requiring some tensor contribution.
CHDM1 and both CDM simulations were started with identical
random number sets describing the initial perturbation
amplitudes. It was found in NKP94 and KNP96 that Set 1
had, by chance, an unusually high power (~$\times 2$) on scales
comparable to the box size. However, the CfA1 data appear
to show similarly unusual power when compared to
the larger APM survey data (NKP96, Vogeley et al. 1992,
Baugh & Efstathiou 1993). CHDM2 had a power spectrum
more typical of a 100 Mpc box. These four halo catalogs are
summarized in Table 1.

Galaxies are identified initially as dark matter halos
with $\delta\rho/\rho > 30$ in 1-cell resolution elements (corresponding
to about 4 cold particles in a cell) which are local maxima
in density. Halos with $M > 7 \times 10^{11} M_\odot$ were broken up to
address overmerging (NKP96).
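The "about 4 cold particles in a cell" figure follows from the particle and grid numbers. A minimal sketch, assuming that the cold particles trace the total density so that the mean occupancy is simply particles per cell:

```python
n_particles_per_side = 256
n_cells_per_side = 512
threshold = 30.0   # delta rho / rho threshold for halo identification

# Mean number of cold particles per cell: (256/512)^3 = 1/8.
mean_per_cell = (n_particles_per_side / n_cells_per_side) ** 3

# A cell at the threshold holds (1 + delta) times the mean.
particles_at_threshold = (1.0 + threshold) * mean_per_cell
print(particles_at_threshold)
```

The same estimate for the $\delta\rho/\rho > 80$ catalogs gives roughly 10 particles per cell.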

We also tested filament and shape statistics on catalogs 
in which we identified galaxy halos as cells with $\delta\rho/\rho > 80$.
These catalogs gave basic results which were quite similar
to the halo catalogs described above, with a slight increase
in Poisson errors due to the smaller number of halos. While the
$\delta\rho/\rho > 30$ catalogs have too many halos to be associated
with visible galaxies, these catalogs still serve our purpose 
of testing whether these statistics can quantify structure 
and discriminate between models in real space. Comparisons 
with real data must be done using simulated redshift-space 
catalogs. 



4.2 The Sky Catalogs 

NKP94 and NKP96 describe the construction of the CfA1-like
sky-projected redshift catalogs from the simulations
described in the previous section, and the merged (to match
simulation resolution) CfA1 catalog. To distinguish these
catalogs, which are designed to mimic many observational
properties of the CfA1 survey, from the halo catalogs
described above, we call the CfA1-like sky-projected
redshift catalogs the sky catalogs. Several aspects of sky
catalog construction that are relevant to filament and shape
statistics are:

• Six view points were chosen from within the CHDM1
and CHDM2 simulations satisfying the conditions that the
local density in redshift space ($V < 750$ km s$^{-1}$) is within
a factor of 1.5 of the merged CfA1 galaxy density, and
the closest Virgo-sized cluster is 20 Mpc away. The CDM
view points were required to be on the halos nearest to the
CHDM1 view point coordinates, and thus the corresponding
sky catalogs, like the halo catalogs, differ only because
of their underlying model physics and not cosmic variance.

• To create a sky catalog of CfA1 size (12,000 km s$^{-1}$,
2.66 steradians), the periodic halo catalogs were stacked,
then cut to form the CfA1 survey geometry; hence structures
appear typically ~3-4 times, although distant galaxies are
sampled sparsely.

• Each sky catalog was cut to CfA1 numbers before fitting
a Schechter luminosity function (after monotonically assigning
Schechter luminosities to mass). The scatter in Schechter
function parameters among the six view points is thus
convolved into the statistics.



4.3 The Effect of Halo Breakup 

The most massive halos in the simulation should generally 
have more than one individual galaxy associated with them 
(Katz & White 1993, Gelb & Bertschinger 1994). These 
"overmerged" halos were broken up as described in NKP96 
(it is the "adopted method" set of catalogs that was used 
here). Only 0.5% of CHDM halos required breakup, raising
the number of halos with $\delta\rho/\rho > 30$ by ~16%. CDM1.5 and
CDM1 catalogs had higher fractions of massive overmerged
halos, 1.3% and 1.7% respectively, raising their breakup halo
populations by 35% and 56%, respectively. We expect the
halo catalog results to be fairly insensitive to breakup since
they probe scales ~3 Mpc and up, much greater than the
radius over which fragments are distributed, which is
typically $\lesssim 1$ Mpc. Indeed we will show this to be the case in
section 5.4.

Despite the larger scales investigated, sky catalogs will 
be more sensitive to breakup. This is because breakup takes 
a single massive halo and fragments it into many closely- 
distributed objects, many of which survive the magnitude 
limit. When normalized to CfA1 number density, the net effect
of breakup is to weight the massive halos more strongly,
giving the appearance on average of moving galaxy halos
into spherical groups (albeit with some "finger of God"
elongation). For a dense catalog, overdense regions will be
augmented at the expense of underdense regions, but for sparse
catalogs like CfA1, only the densest clusters are augmented,
at the expense of filamentary and planar structures. Hence 
halo breakup tends to systematically reduce the amount of 
filamentary and planar structure measured in sky catalogs. 

4.4 Cosmic Variance 

We will compare CHDM1 to CDM1 and CDM1.5 to estimate
the ability of the statistics to discriminate between models,
but by using identical random number set initial conditions,
cosmic variance is explicitly removed. Thus comparisons
between these simulations reflect only differences in the
underlying physics of the models. A proper measurement of the
cosmic variance for these statistics requires performing many
simulations of each model, varying the random sampling of
the initial power spectrum. With limited computational
resources, we only have two such random samplings for a single
model, viz., CHDM1 and CHDM2. NKP94 and KNP96 estimate
that the high power in CHDM1/CDM1/CDM1.5 would
be expected ~10% of the time, translating to a ~1.7$\sigma$
deviation from the norm, while CHDM2 was found to be quite
typical. Thus CHDM1 vs. CHDM2 may be taken as a crude
estimate of 1$\sigma$ cosmic variance. However, a statistic which
shows little difference between CHDM1 and CHDM2 does
not necessarily have negligible cosmic variance, since with 
only two realizations the possibility that the small deviation 
is merely a fortuitous coincidence for that statistic cannot 
be ruled out. In the future, constrained realizations of the 
local universe should bypass uncertainties from cosmic vari- 
ance by constraining the poorly sampled large-scale waves 
in the simulation using redshift surveys (Primack 1995). 
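The correspondence between a ~10% occurrence rate and a ~1.7$\sigma$ deviation can be checked with a short calculation. This sketch assumes Gaussian statistics and a two-sided probability; the helper name `two_sided_sigma` is introduced here for illustration only.

```python
from statistics import NormalDist

def two_sided_sigma(probability):
    """Gaussian |z| that is exceeded with the given two-sided probability."""
    return NormalDist().inv_cdf(1.0 - probability / 2.0)

sigma_10pct = two_sided_sigma(0.10)
print(f"{sigma_10pct:.2f} sigma")
```

Under these assumptions the result is about 1.64$\sigma$, in line with the rounded value quoted in the text.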



5 RESULTS FOR HALO CATALOGS 

5.1 Filament Statistics Applied to Halo Catalogs 

Figure [] shows the results for filament statistics planarity
$\theta_P$, curvature $\theta_C$, and torsion $\theta_T$ vs. R/d applied to the
halo catalogs after breakup. The statistics were
computed for each R/d from 1.2 to 2.5 in increments of
0.1. To estimate errors in the halo catalogs, each statistic
was computed over a random subset of the catalog. The
subset was taken to be as many halos as necessary to generate
500 link sequences. Even for R/d = 1.2, this never
required more than 650 halos; at high R, only a few percent
of the halos generated sequences which did not meet
the $N_{L,\min} = 4$ criterion. The catalog was then resampled
10 times to obtain an error estimate; the error bars shown in
Figure ^ are 3$\sigma$ resampling errors. Since there are more
than 34,000 halos in each catalog, the data are not oversampled.
At R/d = 1.2, there were on average 5.6 links per
sequence; this number rose steadily until R/d > 1.6, where
sequences were almost always terminated due to the
$N_{L,\max} \sim 100\ {\rm Mpc}/d \approx 34$ criterion. The average number
of halos within a sphere of radius R around a given sequence
point rose from ~10 at R/d = 1.2 roughly linearly to ~50
at R/d = 2.5.
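The resampling procedure described above can be sketched as follows. This is a schematic illustration, not the code used for the paper: `resampling_error` is a hypothetical helper, the toy catalog is a list of Gaussian random numbers, and the sample mean stands in for a filament statistic. Only the subset size (650) and the number of resamplings (10) are taken from the text.

```python
import random
import statistics

def resampling_error(catalog, statistic, subset_size, n_resamples=10, seed=0):
    """Evaluate `statistic` on random subsets of `catalog`; return the mean
    and the standard deviation (the 1-sigma resampling error) of the results."""
    rng = random.Random(seed)
    values = [statistic(rng.sample(catalog, subset_size))
              for _ in range(n_resamples)]
    return statistics.mean(values), statistics.stdev(values)

# Toy stand-in for a halo catalog of 34,000 objects.
toy_rng = random.Random(1)
toy_catalog = [toy_rng.gauss(0.0, 1.0) for _ in range(34000)]

mean_value, sigma_res = resampling_error(toy_catalog, statistics.mean, 650)
print(f"value = {mean_value:.3f} +/- {sigma_res:.3f} (1 sigma)")
```

Because ten subsets of at most 650 halos use fewer than a fifth of a 34,000-halo catalog, the subsets overlap little, which is the sense in which the data are not oversampled.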

Figure ^ shows that all three statistics are generally 
higher for the CDM simulations as compared with the 
CHDM simulations, indicating that CDM is less filamentary, 
has fewer sheet-like structures, and has greater clumpiness
than the CHDM simulations. These results are consistent
with the notion that CDM possesses more evolved structures
with clumpier mass distributions, while the presence
of neutrinos in CHDM models results in more extended and
less evolved structures. This notion is confirmed by the
visualizations of BHNPK. Thus filament statistics provide
quantitative differentiation between large-scale structure seen in
the halo catalogs. 

Note that all the statistics tend to fall with increasing R.
This reflects the fact that as the ratio R/L increases, the
greater overlap between adjacent spherical windows generates
stronger correlations between adjacent inertia tensors,
thereby reducing the angle deviations between neighboring
inertia ellipsoid axes. There is an additional effect that is
peculiar to catalogs possessing inherent filamentary structure:
consider a link sequence tracing a path defined by points
contained in a "filamentary structure" of radius $R_{\rm cyl}$. As
we increase R/L we see an increasingly linear distribution
of points in the window, thus lowering the value of
$\theta_C \sim \arcsin(R_{\rm cyl}/R)$. A similar argument holds for planarity
and torsion. In reality the galaxy distribution is more
complex, but the basic result is that sampling large-scale
structure gives $\theta_C(R)$, $\theta_P(R)$, and $\theta_T(R)$ falling at rates
greater than in the Poisson case.

The large difference between simulations and the Pois- 
son catalogs provides a good indicator of how effectively 
structure is identified by filament statistics. Link sequences
identify and follow structure in a Poisson catalog by detecting
chance alignments of halos which masquerade as contiguous
structure due to finite numbers of halos in a given window.
As mentioned before, this structure aliasing is primarily
a low-galaxy-density phenomenon, and hence is most significant
at low R, where all sequences barely exceed $N_{L,\min}$,
and each window barely has $N_{P,\min}$ halos. In this situation
the majority of sequences which qualify will be those lying
along such rare chance alignments of halos. Increasing $N_{P,\min}$
and $N_{L,\min}$ reduces structure aliasing, but the corresponding
reduction in qualifying sequences increases shot noise
significantly. Instead, we simply choose to be careful about 
our interpretations at low R. For instance, for R/d < 1.3 the 
Poisson catalog statistics rise with R, indicating that aliased 
structure is significant here. Structure aliasing occurs in the 
models as well, but is less apparent because halos are cor- 
related, yielding more halos surrounding a given point than 
in the Poisson case. Nevertheless the reduced discrimination 
for R/d < 1.3 is an indication that aliased structure is of 
comparable strength to real structure at these scales. 

At low and high R values, filament statistics discriminate
between the CDM models with different biases, as
shown in Figure [|. At R/d < 1.3 CDM1.5 aliases structure
more effectively than CDM1 since it is more diffuse
(more Poisson-like), while at larger scales ($\gtrsim 7$ Mpc) the
enhanced clustering of CDM1.5 (see BHNPK) tends to trap
sequences in spherical clumps more effectively than CDM1,
giving higher values. Identification of these effects over a
range in R/d of 1.0-2.5 (roughly 3.0-7.5 Mpc) indicates the high
sensitivity of these statistics to the presence of structure.

The two CHDM simulation results are within ~1$\sigma$ of
each other on scales investigated. Hence for these statistics,
cosmic variance between CHDM1 and CHDM2 should be
comparable to resampling errors in the halo catalogs.

5.2 Shape Statistics Applied to Halo Catalogs 

We applied the BS, LV, and RA statistics to the halo cata- 
logs after breakup. To compare with filament statistics, we 
took ten sets of 1000 eligible halos each to compute resam- 
pling errors, varying the window radius R from 1.2d to 2.5d. 
Eligible halos were those which had five or more other ha- 
los within the selected radius R, analogous to the filament 
statistics computation. BS point out that at least 12 halos
are required within a window radius to avoid structure
aliasing and reliably identify a planar structure; however,
for such a high value, few halos are eligible and hence shot
noise reduces the discriminatory power significantly. Lowering
this number to three increases the discriminatory power
slightly, but structure aliasing becomes more significant at
small scales. The results for LV$_{\rm quad}$ and LV$_{\rm plane}$ along with
the RA statistic applied to the halo catalogs after breakup
are shown in Figure ^, while the results for the BS statistics
are shown in Figure 0. We omit LV$_{\rm line}$ for redundancy;
it gives lower discrimination than LV$_{\rm plane}$ but otherwise has
very similar behavior. The error bars shown are 3$\sigma$ resampling
errors.

For R/d < 1.5, the Poisson model is poorly discriminated
from the cosmological models for all statistics. This
is due to strong structure aliasing in these small windows
(R $\lesssim$ 5 Mpc), making the statistics clearly untrustworthy
at these scales. To avoid these spurious detections of structure
we focus on the regime R/d > 1.6. Recall that for filament
statistics, structure aliasing was problematic only for
R/d $\lesssim$ 1.3, so filament statistics are able to discriminate
true structure from aliased structure at smaller scales.

For R/d > 1.6, the results of the shape statistics
are consistent with BHNPK visualizations. CHDM models
show higher filamentarity (BS$_{\rm prol}$ and LV$_{\rm quad}$) and planarity
(BS$_{\rm obl}$, LV$_{\rm plane}$, and RA) than CDM models, while
the sphericity measure BS$_{\rm sph}$ is higher for the CDM models.
However, CDM1.5 is generally closer to the CHDM models
until R/d $\gtrsim$ 2.0. Comparing the values for BS$_{\rm prol}$ and BS$_{\rm obl}$
shows that the halo distribution is more oblate than prolate,
i.e. that large-scale structure in these models is dominated
by sheets rather than filaments. A comparison of LV$_{\rm plane}$ and
LV$_{\rm line}$ (not shown) yields the same conclusion. Thus these
shape statistics, like filament statistics, confirm and quantify
the visually apparent differences between these models.

As R/d increases, the errors become smaller primarily 
due to more halos being included within each window. The 
statistical values also decrease partly due to a reduction in 
structure aliasing, and partly because larger windows tend 
to sample more spherical mass distributions. 

All statistics appear to be fairly sensitive to the chosen 
bias as well as the chosen set of initial conditions. CDM1 is 
well discriminated from CDM1.5 except in the region around 
R/d ~ 2.5 where their curves intersect; CDM1, being the 
more evolved model, contains less filamentary and planar 
structure at small scales. For the LV statistics and BS$_{\rm sph}$,
the cosmic variance estimated from the difference between
CHDM1 and CHDM2 begins to dominate over resampling
errors for R/d $\gtrsim$ 2.0, while RA and BS$_{\rm obl}$ show $\gtrsim 2\sigma$
cosmic variance at all scales. Only for BS$_{\rm prol}$ is the cosmic
variance always comparable to the resampling error for
R/d > 1.6.

In comparison with filament statistics, the shape statis- 
tics give the same conclusion regarding structure formation 
in the various models, but appear to have more sensitivity 
to cosmic variance (with the exception of BS$_{\rm prol}$), and are
more susceptible to structure aliasing at small scales. The 
large difference between CDM1 and CDM1.5 indicates shape 
statistics are more sensitive to the normalization of the cos- 
mological model than filament statistics; to some degree, 
this makes shape statistics a complementary diagnostic. 



5.3 Measuring the Discriminatory Power 

We now introduce a set of metastatistics to compare statistics
and assess the effectiveness of our analysis. These metastatistics
will allow a direct comparison of the discriminatory
power and robustness of filament statistics versus shape
statistics. Discrimination between models for a given statistic
$\theta$ can be measured by the signal strength $S^{\theta}_{\rm res}$ between
catalogs:

$$S^{\theta}_{\rm res}(1,2) = \frac{|\theta_1 - \theta_2|}{\sqrt{\sigma_{\theta_1}^2 + \sigma_{\theta_2}^2}} \qquad (12)$$

where $\theta_1$ and $\theta_2$ are values of statistic $\theta$ for catalogs 1
and 2, respectively, and $\sigma_\theta$ is the resampling error for that
statistic and catalog. The subscript "res" denotes that the
units of $S^{\theta}_{\rm res}$ are 1$\sigma$ resampling errors. To compare CHDM
to CDM at (roughly) COBE normalization while excluding
cosmic variance, we compare CHDM1 to CDM1. Figure |(a)
shows $S^{\theta}_{\rm res}$(CDM1, CHDM1) for the halo catalogs for
$\theta = \{\theta_P, \theta_C, \theta_T\}$. For R > 1.4, where structure
aliasing is unimportant, planarity shows the highest
signal, then curvature, then torsion, with all statistics showing
$S^{\theta}_{\rm res}$(CDM1, CHDM1) $\gtrsim 4\sigma$. Thus filament statistics are
fairly discriminatory for the halo catalogs; their robustness
against halo breakup will be formally investigated in §5.4.
Cosmic variance is not expected to dramatically degrade
the discrimination as the differences between CHDM1 and
CHDM2 are comparable to the resampling errors.

The signal strengths for the LV, BS, and RA statistics
applied to the halo catalogs after breakup are shown in
Figure ^b (bottom panel). For most statistics, discrimination is
better than 5$\sigma$ for R/d > 1.6. The significant exception is
BS$_{\rm prol}$, with barely ~2-3$\sigma$ discrimination for R/d > 1.8.
However, recall that BS$_{\rm prol}$ was also the only statistic whose
cosmic variance was small compared with resampling errors.
With cosmic variance taken into account, BS$_{\rm prol}$ appears to
be quite comparable in discriminatory power to the other
statistics. In general, both filament and shape statistics appear
to easily discriminate structure formation in CDM
models versus CHDM models. Cosmic variance, though, appears
to be more of a concern for the shape statistics, with
the exception of BS$_{\rm prol}$.



5.4 Robustness Against Halo Breakup

The identification of galaxies in simulations represents a major
uncertainty in this type of analysis. Given that the simulations
are dissipationless, it is not possible to directly identify
clumps of baryons which would be expected to form
galaxies. Instead, assumptions must be made regarding how
the baryonic matter traces the dark matter. In addition, because
of the limited resolution of the simulations, a single
clump of dark matter may contain several galaxies (often referred
to as the "overmerging problem"), and must be broken
up to obtain a true sample of galaxies. The detailed
assumptions made in this procedure (described in NKP96)
are somewhat ad hoc, so it is important to somehow quantify
the uncertainty introduced by our lack of knowledge.

To do this we compare the statistical values for the
catalogs before breakup and after breakup. The effects of
breakup on these statistics are expected to be as follows:
because single halos are broken into several spherically-distributed
halos, the tendency will be to decrease the detection
of planar and filamentary structure, and increase the
detection of spherical structure. This is in fact what is seen.
Since it is clear some sort of halo breakup must be done, and
that halo breakup has a monotonic effect on the statistics,
a comparison of catalogs before and after breakup should
yield a somewhat conservative estimate of the uncertainty
introduced by this procedure, unless the breakup scheme
used here produces far too few fragments. We first measure
the halo identification uncertainty factor, given by

$$F_{\rm id}(\theta) = \frac{|\theta_{\rm bu} - \theta_{\rm nobu}|}{\sqrt{\sigma_{\theta_{\rm bu}}^2 + \sigma_{\theta_{\rm nobu}}^2}} \qquad (13)$$



where $\theta$ represents the value of the statistic in question, bu
and nobu refer to breakup and no-breakup catalogs, and
$\sigma_\theta$ represents the resampling error for statistic $\theta$. We then
combine this error with resampling errors to obtain the combined
signal strength metastatistic, presented as a function
of radius R:



$$S^{\theta}_{\rm res+id}(1,2) = \frac{S^{\theta}_{\rm res}(1,2)}{\max[1.0,\ F_{\rm id}(\theta)]} \qquad (14)$$
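The three metastatistics can be written compactly in code. The sketch below reads equations (12) and (13) as an absolute difference divided by the two resampling errors added in quadrature, which is an interpretation of the definitions in the text; the function names and the numerical inputs are hypothetical and chosen only to make the arithmetic transparent.

```python
import math

def signal_strength(theta1, sigma1, theta2, sigma2):
    """Equation (12): separation of two catalogs in units of resampling error."""
    return abs(theta1 - theta2) / math.hypot(sigma1, sigma2)

def identification_factor(theta_bu, sigma_bu, theta_nobu, sigma_nobu):
    """Equation (13): shift caused by halo breakup in units of resampling error."""
    return abs(theta_bu - theta_nobu) / math.hypot(sigma_bu, sigma_nobu)

def combined_signal_strength(s_res, f_id):
    """Equation (14): signal is degraded only when the breakup shift exceeds 1 sigma."""
    return s_res / max(1.0, f_id)

# Hypothetical values of one statistic (not taken from the paper's figures).
s_res = signal_strength(30.0, 0.3, 28.0, 0.4)         # |30 - 28| / 0.5
f_id = identification_factor(30.0, 0.3, 31.0, 0.4)    # |30 - 31| / 0.5
s_combined = combined_signal_strength(s_res, f_id)
print(s_res, f_id, s_combined)
```

The max[1.0, F_id] denominator means that a statistic whose breakup shift is smaller than its resampling error keeps its full signal strength, while a larger shift reduces it proportionally.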



For the filament statistics applied to the halo catalogs,
we compute $S^{\theta}_{\rm res+id}$(CHDM1, CDM1) for each statistic
for R = 1.2-2.5. The results are plotted in
Figure ^|(a) (top panel). Torsion shows no degradation of signal,
as $F_{\rm id}(\theta_T) < 0.4$ for all values of R; this statistic is
highly robust against halo identification uncertainty. Curvature,
conversely, shows some degradation, as it typically
has $F_{\rm id}(\theta_C) \sim 1.7$. Planarity, which showed the highest signal
strength, is by far the least robust, with $F_{\rm id}(\theta_P) \sim 2$
and as high as 2.5 at some values of R. Comparing with
Figure |^(a), torsion now appears marginally to be the best
filament statistic, showing typically 4$\sigma$ robust discrimination
between models, while curvature and planarity have
$S^{\theta}_{\rm res+id}$(CHDM1, CDM1) ~ 3. Recall from Figure | that all
statistics show CHDM1 and CHDM2 being $\lesssim 1\sigma$ apart, so
these conclusions should not be dramatically affected by cosmic
variance.

Figure ^|b (bottom panel) shows the combined signal
strengths computed for shape statistics applied to the halo
catalogs. For all the statistics except RA, the results are generally
insensitive to breakup. For RA, $F_{\rm id} \sim 1.5$ typically, but
this still leaves RA with discriminatory power comparable to that
of other shape statistics. While no shape statistic is clearly optimal
by this measure, BS$_{\rm obl}$, BS$_{\rm sph}$, and LV$_{\rm plane}$ appear to
show the strongest discrimination; however cosmic variance
is a concern for those statistics. BS$_{\rm prol}$, which had low cosmic
variance, is not very discriminatory. Overall, there are
statistics from each category showing over 4$\sigma$ discrimination
which is robust against variations in the galaxy identification
scheme.



6 RESULTS FOR SKY CATALOGS 

6.1 The Effect of Redshift Distortion 

Distortion of structure due to peculiar motions of individual 
galaxies (e.g. fingers of God) could in principle significantly 
degrade the ability of all of these statistics to quantify true 
structure. In our case, since CDM contains higher peculiar 
velocities, one might expect stronger fingers of God which 
could mimic the true filamentarity contained in CHDM 
and thereby work against the discriminatory power of these 
statistics. This does not turn out to be the case, however, 
because fingers of God are elongated only in the line-of-sight
direction $\hat{r}$ in redshift space, whereas in general a structure
in real space will not be aligned with $\hat{r}$. Thus redshift distortion
tends to smear out and hence decrease the amount of
structure detected in these simulations, slightly more so in
models with more redshift distortion. For filament statistics,
a further effect of redshift space distortion is to misguide
sequences and increase the angle deviation of a sequence passing
through a cluster. The net result is that redshift distortion
does not significantly undermine the ability of any of
these statistics to distinguish between cosmological models,
and in some cases slightly enhances the discrimination.
To test the effect of redshift distortion we adopt the
strategy of applying the statistics to mock redshift catalogs
constructed to exaggerate the distortion due to peculiar
velocities. We first cut the halo catalogs at a high density
threshold, roughly mimicking CfA1 sparseness. Then for
each halo we compute the line-of-sight velocity $v_{\rm los}$ with respect
to an observer at one corner of the simulation volume.
We then multiply this velocity by $F_V$, the velocity scaling
factor, and shift the halo position along the line of sight by

$$\Delta\mathbf{x} = F_V\, v_{\rm los}\, \hat{r} / H_0 \qquad (15)$$

where $\hat{r}$ is the direction from the box center to the given
halo. Thus $F_V = 0$ corresponds to real space, $F_V = 1$ corresponds
to redshift space, while higher $F_V$ yields an exaggerated
shift from which we can gauge the sensitivity of the
statistics to redshift distortion. We choose R/d = 1.5 when
applying the test to filament statistics, and R/d = 1.8 when
testing the other shape statistics.
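The position shift of equation (15) can be illustrated with a few lines of code. This is a sketch under stated assumptions: the function `redshift_shift` is hypothetical, and the unit vector is taken here to point from the observer to the halo, which is one reading of the line-of-sight direction in the text. Positions are in Mpc and velocities in km s$^{-1}$.

```python
import math

H0 = 50.0  # km/s/Mpc, as assumed throughout the paper

def redshift_shift(position, velocity, observer, f_v, h0=H0):
    """Shift a halo along the observer's line of sight by f_v * v_los / h0 (Mpc)."""
    sep = [p - o for p, o in zip(position, observer)]
    dist = math.sqrt(sum(s * s for s in sep))
    r_hat = [s / dist for s in sep]                      # unit line-of-sight vector
    v_los = sum(v * r for v, r in zip(velocity, r_hat))  # line-of-sight velocity
    return [p + f_v * v_los * r / h0 for p, r in zip(position, r_hat)]

# Hypothetical halo 50 Mpc from an observer at the box corner,
# receding at 500 km/s along the line of sight.
halo = [30.0, 40.0, 0.0]
vel = [300.0, 400.0, 0.0]
corner = [0.0, 0.0, 0.0]

real_space = redshift_shift(halo, vel, corner, f_v=0.0)
redshift_space = redshift_shift(halo, vel, corner, f_v=1.0)
exaggerated = redshift_shift(halo, vel, corner, f_v=2.0)
print(real_space, redshift_space, exaggerated)
```

With these inputs the halo moves outward by 10 Mpc for $F_V = 1$ and by 20 Mpc for $F_V = 2$, and is left in place for $F_V = 0$.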

Figure ^(a) (left side) shows the effect of the transformation
from real to redshift space upon filament statistics
for halo catalogs with a density threshold of $\delta\rho/\rho > 80$,
while Figure ^(b) (right side) shows the equivalent results
for $\delta\rho/\rho > 120$ catalogs. At $F_V = 1$, we see a definite increase
in the discrimination between the models as compared
to $F_V = 0$, most notably for the curvature and torsion statistics.
This effect is more pronounced in the sparser catalogs.
The statistical values increase, indicating structure is being
smeared out by redshift distortion. The trend continues
to higher $F_V$, with no dramatic dropoff in the discriminatory
power of filament statistics. This test was also run on
the pre-breakup versions of the same catalogs, and it was 
found that the interpretations are virtually independent of 
breakup in both real and redshift space. 

Figure ^ shows the results of the redshift distortion test 
for three selected shape statistics (the others show similar 
behavior). As with filament statistics, less structure is detected
when redshift distortion is included. Unlike filament
statistics, however, there is no apparent increase in the
discriminatory power of these statistics at $F_V = 1$. Also, exaggerated
redshift distortion ($F_V > 2$) has little further effect
on the statistics. Again, these results are fairly insensitive 
to breakup and catalog density. 

In summary, the primary effect of redshift distortion is 
to decrease the strength of structure, which if anything will work
to amplify the discrimination between the models consid- 
ered. Filament statistics, which emphasize regions of higher 
density, are more affected by redshift distortion than the 
randomly-sampled shape statistics, and we see greater am- 
plification of discrimination between models. 

6.2 Filament Statistics Applied to Sky Catalogs 

Figure 9 shows the results of filament statistics applied to
the sky catalogs after halo breakup. Every galaxy in each sky
catalog was tried as a possible sequence starting point. For
each catalog, at R = 1.2, around 800 of the $\approx$2360 galaxies
typically generated sequences with number of links exceeding
$N_{L,\min} = 4$. This number rose roughly linearly until
R = 2.5, where ~2200 galaxies qualified, on average, in each
catalog. There were systematic differences between the catalogs
as well, with CHDM2 showing the largest number of
accepted sequences, about 5-10% more than the CDM models.
CHDM1 showed the lowest number, consistently slightly
below the CDM models. At R = 1.2, there were on average
about 6 links per sequence; this number rose fairly linearly
with R, such that at R = 2.5, there were around 20 links per
sequence. The average number of galaxies within a sphere
of radius R around a given sequence point rose from 8-10
at R = 1.2 roughly linearly to 25-30 at R = 2.5.

The error estimate for each statistic in sky catalogs was 
determined from sky variance, by computing the statistic at 
each of six vantage points, and getting an average value and 
standard deviation for that statistic. The error bars shown 
in Figure 9 are 1$\sigma$ sky variance errors. Since our box is relatively
small, different viewpoints are still seeing many of the
same structures, although with differing depth. Sky variance 
is therefore expected to underestimate true cosmic variance, 
perhaps significantly. 

Figure 9 shows that both CHDM models still show more
structure than either CDM model, consistent with our intuitive
picture of structure formation in these models, and all
models are fairly well discriminated from the Poisson catalog.
However, CHDM1 shows significantly more structure
than CHDM2, by up to ~2$\sigma$ for the torsion and curvature
statistics, indicating that sky variance is an inadequate
estimate of cosmic variance. The extra large-scale
power in CHDM1, accentuated by the artificial replication
of structure in the construction of the sky catalogs at 100 to
$100\sqrt{3}$ Mpc intervals, produces more large-scale structure in
CHDM1 than in CHDM2. This is more apparent in sky catalogs
than in the halo catalogs since the scales investigated
are much larger, with d ~ 10 Mpc even in the region where
the sky catalogs are complete.

To test sensitivity to shot noise and catalog boundary
effects, filament statistics were applied to (nearly) full-sky
versions of the CfA1-like sky catalogs with a zone
of avoidance $|b| < 10^\circ$ about each viewpoint, covering
10.384 sr instead of 2.66 sr and containing about four times
as many galaxies ($\approx$ 9200). Since the 2.66 sr catalogs and
the 10.384 sr catalogs are derived from the same simulation
data set, we are still sampling from the same distribution
of cluster sizes and shapes. The resulting signal strength
increased by a factor of ~2 (for R/d > 1.3) as expected
if the errors are dominated by shot noise. The degradation
of the signal from the halo catalogs to the sky catalogs is
thus primarily due to sparseness. For a survey such as the
Optical Redshift Survey (Santiago et al. 1995, 1996) which
covers 8.09 sr at CfA1 depth, we expect to see well over 3$\sigma$
discrimination between models, excluding cosmic variance.
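The shot-noise argument can be made explicit with a short calculation. This sketch assumes that errors scale as $1/\sqrt{N}$, so that signal strength grows as the square root of the number of galaxies, and that at fixed depth the galaxy count scales with solid angle; the helper `shot_noise_gain` is hypothetical, and the galaxy counts are the approximate values quoted in this section.

```python
import math

def shot_noise_gain(n_new, n_old):
    """Expected growth in signal strength if errors scale as 1/sqrt(N)."""
    return math.sqrt(n_new / n_old)

# Solid angle of the full sky outside a |b| < 10 degree zone of avoidance.
full_sky_sr = 4.0 * math.pi * (1.0 - math.sin(math.radians(10.0)))

# Full-sky catalogs (~9200 galaxies) versus CfA1-like catalogs (~2360 galaxies).
gain_full_sky = shot_noise_gain(9200, 2360)

# A survey covering 8.09 sr versus 2.66 sr at the same depth.
gain_ors = shot_noise_gain(8.09, 2.66)

print(f"full-sky solid angle: {full_sky_sr:.3f} sr")
print(f"gain, full sky: {gain_full_sky:.2f}   gain, 8.09 sr: {gain_ors:.2f}")
```

The full-sky gain of about 2 matches the observed increase in signal strength, and the same scaling suggests a gain of roughly 1.7 for an 8.09 sr survey.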

We quantify boundary effects by comparing statistical
values for the 2.66 sr catalogs vs. the 10.384 sr catalogs, and find
that for R/d $\gtrsim$ 2.0, the CfA1-like catalogs show significantly higher
values (comparable to sky variance) than the 10.384 sr catalogs,
indicating that the entire catalog volume was contributing
as a single radial filamentary structure. This was also
evident from visualizations of the link sequences, as at large
R the sequences were preferentially radially directed. Visualization
also showed that link sequences were distributed
throughout the sky catalog volume, with very few lying in
the foreground, r $\lesssim$ 20 Mpc. Recall that d(r) is small at low
r, and the Virgo Cluster, being nearby, contributes hardly
any sequences even though it gives a large finger of God.
At small R, sequences tended to be shorter and terminate
within the catalog volume, while at large R they tended to
terminate once they exceeded the catalog boundary and found
no nearby galaxies.

The statistics were also applied to 80 Mpc volume- 
limited versions of the sky catalogs, with typically 400- 
500 galaxies in each. The statistics showed very large shot- 
noise scatter, and gave no significant discrimination between 
models. Volume limiting certainly yields more interpretable 
statistics, but for CfA1 and our similar-size simulation sky
catalogs, there are simply too few galaxies. 

6.3 Shape Statistics Applied to Sky Catalogs 

In Figure 10 we present the (selected) LV and RA statistics
and in Figure 11 the BS statistics, applied to the sky catalogs
after halo breakup. Also plotted as solid lines are the
results for the CfA1 catalog. Each statistic was computed
around every galaxy in the sample. The errors are increased
greatly over the halo catalog case because of the sparseness
of these CfA1-like catalogs; the error bars shown are 1$\sigma$ sky
variance errors.

The interpretation of statistical values again confirms 
the intuitive picture of structure formation in these models. 
CHDM models show greater filamentarity and planarity and 
less sphericity than CDM models for R/d $\gtrsim$ 1.5. As with the
halo catalogs, the galaxy distribution shows stronger planarity
than filamentarity, with BS$_{\rm obl}$ typically twice BS$_{\rm prol}$
at any given R. Also, the CDM models show different trends 
versus R analogously to the halo catalogs, with CDM1.5 
dropping faster versus R/d than CDM1. 

For the LV and RA statistics applied to the sky catalogs
(Figure 10), structure aliasing is a significant concern. The
Poisson catalog is not discriminated from the models until
the scales are quite large, R/d ~ 1.8 for LV_quad and RA.
The higher order LV statistics (represented in Figure 10 by
LV_plane; LV_linc and LV_flat show similar behavior) are never
well discriminated from the Poisson catalog. As with the
halo catalogs, filament statistics do a better job avoiding
aliased structure at small scales. The two CHDM realizations are
well separated from each other, indicating that cosmic variance dominates
over sky variance for these statistics. In all, the LV and RA
statistics appear less able to reliably quantify structure than
filament statistics in a catalog as sparse as CfA1.

The BS statistics (Figure 11) do not have quite as much
difficulty distinguishing a Poisson catalog from the cosmological
models as the LV and RA statistics, although they
still cannot discriminate for R/d < 1.5. BS_prol, just as in
the halo catalog case, shows remarkably little cosmic variance
for R/d ≳ 1.8, although with only two realizations of
CHDM the possibility that this is merely a fortuitous coincidence
cannot be ruled out. For BS_obl and BS_sph, cosmic
variance is again a significant source of uncertainty, with
CHDM1 generally closer to the CDM models than CHDM2.

We also applied these statistics to the full-sky versions
(10.384 sr) of the sky catalogs. The discriminatory power
of the best statistics increased only to ~3σ, showing that
these statistics are not completely dominated by Poisson
noise, and are more affected by halo identification uncertainty
than filament statistics. As with filament statistics,
boundary effects become significant for R/d ≳ 2.0.



6.4 Robustness and Discrimination Between Models

In Figure 12 we present the combined signal strength
S_sv+id(CDM1,CHDM1) for all the statistics applied to the
sky catalogs. The subscript "sv" signifies that we are including
sky variance errors. The results before breakup are not
shown, but for most statistics, the sparseness of the CfA1
catalog generates sky variance errors which dominate over
halo identification uncertainty, so F_id < 1 at nearly all R.
The exception is the RA statistic, which had F_id ~ 1.5 typically.
As described earlier, the catalogs before breakup
show slightly more structure than after breakup. It turns out
that for filament statistics, this represents a ≲ 1σ increase
in each statistic for the sky catalogs, which is generally less
than sky variance errors. There is little qualitative difference
in S_sv+id(CDM1,CHDM1) for no-breakup sky catalogs.
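
The defining equation for the combined signal strength lies outside this section, so the sketch below assumes a common form: the absolute difference of the mean statistic in two models, divided by the quadrature sum of the sky variance and halo identification errors. The function name, argument names, and the use of the sample scatter across mock catalogs as the sky variance are assumptions for illustration, not the definition used in this paper.

```python
import numpy as np

def combined_signal_strength(stat_a, stat_b, id_err_a=0.0, id_err_b=0.0):
    """Assumed form: |<A> - <B>| / sqrt(sum of variances).

    stat_a, stat_b     : values of one statistic over the sky catalogs
                         of models A and B at a fixed R/d
    id_err_a, id_err_b : 1-sigma halo identification uncertainty per model
    """
    a = np.asarray(stat_a, dtype=float)
    b = np.asarray(stat_b, dtype=float)
    # Sky variance: sample scatter of the statistic across mock catalogs.
    sv_a = a.std(ddof=1)
    sv_b = b.std(ddof=1)
    # Independent error sources are added in quadrature.
    noise = np.sqrt(sv_a**2 + sv_b**2 + id_err_a**2 + id_err_b**2)
    return abs(a.mean() - b.mean()) / noise
```

Under this assumed form, adding a halo identification error comparable to the sky variance lowers the signal strength by roughly a factor of √2, which is the sense in which one error source can "dominate" the other.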

Figure 12(a) shows that for filament statistics, discrimination
between CHDM1 and CDM1 is strongest in torsion
(~2.5σ) and curvature (~1.5-2σ), while planarity shows
no significant discrimination between CHDM and CDM.
Planarity is weaker because it is not as significantly amplified
by redshift distortion as curvature and torsion, as
was described earlier (see Figure 7(b)). While promising,
these levels of discrimination are comparable to our crudely
estimated cosmic variance.

The signal strengths S_sv+id(CHDM1,CDM1) for the
shape statistics applied to the sky catalogs after breakup
are shown in Figure 12(b) (bottom panel). Greatest discrimination
is seen for BS_prol, at a modest ~1.5-2σ level for
1.7 ≲ R/d ≲ 2.2. The LV_linc statistic shows some apparent
discrimination at R/d ~ 1.5, but recall that at this R this
statistic does not discriminate a Poisson catalog from the
cosmological models. None of the other statistics show significant
discrimination between these models.

Overall, we conclude that for the CfA1-like sky catalogs,
the best filament statistic is torsion, which clearly has
the greatest discriminatory power of any statistic, with the
caveat that it may be significantly degraded by cosmic variance.
Of the shape statistics, the Babul & Starkman prolateness
measure BS_prol gives the most discrimination between
models when applied to a CfA1-like data set, showing
some discriminatory power (up to 2σ) and good robustness
against both halo identification uncertainty and cosmic variance.
The LV and RA shape statistics show little discriminatory
power between models or even from a Poisson catalog in
such a sparse survey. Testing on 10.384 sr versions of the sky
catalogs shows that all these statistics are hampered mostly
by shot noise; thus a larger data set is required to properly
discriminate between these models. The Optical Redshift
Survey (Santiago et al. 1995, 1996), which has CfA1 depth
but 8.09 sr sky coverage, will be very useful for this purpose.

6.5 Comparing Models vs. CfA1 Data

Using these statistics we can compare the sky catalog results
directly to CfA1 data. For filament statistics shown in
Figure 9, the CfA1 catalog follows the CDM models more
closely than the CHDM models. However, given the uncertainty
in halo identification (~1σ) and cosmic variance, it is
difficult to conclusively state which model agrees best with
filament statistics based on the CfA1 data set.

The various shape statistics presented in Figures 10 and
11 likewise show that no single model of those considered
here is completely consistent with CfA1 data. The LV and
RA statistics show best agreement with CHDM1, but these
statistics have large cosmic variance. BS_obl and BS_sph show
best agreement with the CHDM models, while BS_prol shows
best agreement with the CDM models. With at best 2σ
discrimination combined with the uncertainties in our estimates
of cosmic variance and halo identification robustness,
we again cannot favor or rule out any models based on these
shape statistics applied to the CfA1 redshift survey.



7 CONCLUSIONS 

In this paper we present filament statistics, a new set 
of statistics for quantifying filamentarity and planarity in 
large-scale structure. We compare these statistics to the 
shape statistics of Babul & Starkman (1992), Luo & Vish- 
niac (1995), and Robinson & Albrecht (1996) by introducing 
metastatistics which quantify the discriminatory power and 
robustness of each statistic. We find that when applied to 
the halo catalogs, most of the statistics considered are sen- 
sitive and robust diagnostics of large scale structure that 
effectively discriminate simulations of CDM models from 
simulations of CHDM models, with robust discrimination
of ≳ 4σ between CDM and CHDM models. Cosmic variance
is low for the filament statistics, but more of a concern
for all the shape statistics except perhaps BS_prol. The signal-to-noise
ratio between any model and the Poisson catalog is
very large for all R > 1.5d, where R is the window radius 
and d is the mean intergalaxy spacing. Finally, all statistics 
show that CHDM contains more sheet-like and filamentary 
structures than CDM, consistent with intuitive expectations 
as well as visualizations done by BHNPK. 

Comparison with redshift survey data must be done in
redshift space with the appropriate survey geometry. We
compare CDM and CHDM models to CfA1 data by utilizing
a sample of CfA1-like redshift catalogs constructed from
each of the simulations, and comparing these "sky catalogs"
directly to the CfA1 survey. When one views the statistics'
results for sky catalogs, it is unclear which statistic
provides the most discrimination between models. Filament
statistics tend to show better robust discrimination than the
shape statistics, but cosmic variance is a concern. Comparing
models to CfA1, we find the filament statistics show, at
face value, that the CDM simulations provide the best fit
to CfA1 data. On the other hand, for most shape statistics
the CHDM models appear to be a better fit. In all cases the
discrimination is poor, and significantly weakened by uncertainties
in halo identification as well as cosmic variance. A
proper comparison of statistics and of models versus redshift
survey data must await larger data sets.



In a broad context, we view filament statistics as illus- 
trative of a new methodology for constructing statistics to 
analyze spatial data. We utilize inertia tensors to charac- 
terize the local mass distribution, similarly to the LV, BS, 
and RA statistics. But rather than deriving combinations of 
tensor moments to quantify structure, filament statistics use 
link sequences to generate new data samples which amplify 
properties of interest in the underlying data set. The link 
sequence approach was conceived of as an intuitive means 
of simplifying the complex topology of the galaxy point set 
while enhancing the sense of approximate connectivity of its 
large-scale isodensity surfaces (which the eye might recog- 
nize as "filamentarity"). Since the link sequences are guided 
by the distribution of galaxies, not bound by it (as in De- 
launay or Voronoi tessellations, see e.g. van de Weygaert 
1991, or minimal spanning trees, see e.g. Pearson & Coles 
1995), they are more likely to be robust against variations in 
the galaxy locations and halo breakup, although as we have 
seen, robustness against galaxy identification in magnitude- 
limited mock redshift catalogs is a trickier issue. Another 
approach for using link sequences is to apply shape statis- 
tics like those of LV, BS, and RA to the newly created data 
sample, producing statistics which may be more discrimina- 
tory than any presented here. We plan to investigate this 
possibility in the future. 
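
The link sequence idea summarized above can be sketched in a few lines. This is a minimal illustration, not the implementation used in this paper: the function names, the minimum neighbour count, the cap on the number of links, and the choice to compute the moment tensor about the window center are all assumptions. Here `radius` and `link` are absolute lengths, whereas the text quotes R and L in units of the mean spacing d.

```python
import numpy as np

def major_axis(points, center, radius):
    """Unit major axis of the moment tensor of the points within `radius`
    of `center`, or None if too few points are found (assumed cutoff)."""
    dx = points - center
    near = dx[np.einsum("ij,ij->i", dx, dx) <= radius**2]
    if len(near) < 3:  # assumed minimum number of neighbours
        return None
    tensor = near.T @ near  # 3x3 second-moment tensor about the center
    evals, evecs = np.linalg.eigh(tensor)
    return evecs[:, -1]  # eigenvector of the largest eigenvalue

def propagate(points, start, direction, radius, link, max_links=50):
    """Step along the local major axis, keeping the heading consistent."""
    nodes, pos, heading = [], start.copy(), direction
    for _ in range(max_links):  # assumed cap on sequence length
        pos = pos + link * heading
        axis = major_axis(points, pos, radius)
        if axis is None:  # no nearby galaxies: terminate
            break
        nodes.append(pos)
        # The sign of an eigenvector is arbitrary; keep moving forward.
        heading = axis if axis @ heading >= 0 else -axis
    return nodes

def link_sequence(points, i_start, radius, link, min_links=4):
    """Sequence through galaxy `i_start`, or None if it does not qualify."""
    start = points[i_start]
    axis = major_axis(points, start, radius)
    if axis is None:
        return None
    forward = propagate(points, start, axis, radius, link)
    backward = propagate(points, start, -axis, radius, link)
    # Discard sequences with fewer than `min_links` links in total.
    if len(forward) + len(backward) < min_links:
        return None
    return np.array(backward[::-1] + [start] + forward)
```

Because each step is placed a fixed length along the local major axis rather than at a galaxy position, the nodes are guided by the galaxy distribution without being bound to it, which is the property the text contrasts with tessellations and minimal spanning trees.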

The success of these statistics for the halo catalogs indi- 
cates that larger, denser redshift surveys coupled with larger 
simulations will provide a significant increase in the robust- 
ness and discriminatory power of these statistics versus real 
survey data. A proliferation of such large redshift surveys is 
already underway. On the simulations front, good progress 
is being made in scaling up the size and resolution of cos- 
mological simulations, as well as in constructing constrained 
realizations of the local universe (Primack 1995) by which 
one may avoid uncertainties of cosmic variance. Thus we 
soon hope to have a suite of significantly larger simulations 
of currently favored models which we can compare to these 
large redshift surveys. Finally, there is interesting work being
done in more realistically handling the overmerging problem
by combining approximations to hydrodynamics with
Press-Schechter type formalisms to accurately model the numbers of
galaxies near the resolution limit of the simulations (Kauffmann,
Nusser & Steinmetz 1995; Somerville et al. 1996, in
preparation). In the coming years we hope to establish these
statistics which quantify the shapes of large-scale structure 
as significant constraints on cosmological models of struc- 
ture formation. 

Table 1. Halo catalogs (KNP96, NKP96)

Model    Ω_c/Ω_ν/Ω_b   Bias      Q_rms (μK)   Init. Cond.   No. of Gals.     d (Mpc)

CDM1     1.0/0/0       b = 1.0   12.8         Set 1         58,121(37,164)   2.58(3.00)
CDM1.5   1.0/0/0       b = 1.5   8.5          Set 1         61,690(45,592)   2.53(2.80)
CHDM1    0.6/0.3/0.1   b = 1.5   17.0         Set 1         34,000(29,151)   3.09(3.25)
CHDM2    0.6/0.3/0.1   b = 1.5   17.0         Set 2         34,554(29,765)   3.07(3.23)

The number of galaxies and mean interparticle spacing d computed before halo breakup are indicated in parentheses.

Table 2. Shape statistics test cases: Line, Plane, and Sphere

Stat       Line              Plane                Sphere

LV_quad    0.976/0.989 (1)   0.269/0.254 (0.25)   0.012/0.006 (0)
LV_linc    0.953/0.978 (1)   0.035/0.008 (0)      0.004/0.002 (0)
LV_plane   0.000/0.000 (0)   0.906/0.978 (1)      0.029/0.014 (0)
LV_flat    0.953/0.978 (1)   0.959/0.989 (1)      0.033/0.015 (0)
BS_prol    1.000/1.000 (1)   0.016/0.002 (0)      0.004/0.002 (0)
BS_obl     0.000/0.000 (0)   0.959/0.991 (1)      0.012/0.001 (0)
BS_sph     0.000/0.000 (0)   0.000/0.000 (0)      0.909/0.977 (1)
RA         0.000/0.000 (0)   0.659/0.743 (1)      0.168/0.063 (0)

Values of individual statistics for three test case random distributions.
The first value is for R = 5 Mpc, the second is for R = 10
Mpc, and the value in parentheses is the analytical value for that
distribution.
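
The test cases of Table 2 can be illustrated with a minimal sketch that draws random points on a line, in a plane, and in a sphere, and inspects the eigenvalues of their second-moment tensor. This does not reproduce the LV, BS, or RA definitions, which are given elsewhere in the paper; the helper name, point count, radius, and random seed are assumptions for illustration only.

```python
import numpy as np

def moment_eigenvalues(points):
    """Eigenvalues (descending, normalized to the largest) of the
    second-moment tensor of a point set about its centroid."""
    x = np.asarray(points, dtype=float)
    dx = x - x.mean(axis=0)
    tensor = dx.T @ dx / len(x)
    evals = np.linalg.eigvalsh(tensor)[::-1]
    return evals / evals[0]

rng = np.random.default_rng(0)  # assumed seed, for repeatability
n, radius = 5000, 5.0

# Line: uniform along the x axis.
line = np.zeros((n, 3))
line[:, 0] = rng.uniform(-radius, radius, n)

# Plane: uniform in a disc in the x-y plane.
phi = rng.uniform(0.0, 2.0 * np.pi, n)
rho = radius * np.sqrt(rng.uniform(0.0, 1.0, n))
plane = np.column_stack([rho * np.cos(phi), rho * np.sin(phi), np.zeros(n)])

# Sphere: uniform in a ball, via rejection sampling from a cube.
cube = rng.uniform(-radius, radius, (4 * n, 3))
sphere = cube[np.linalg.norm(cube, axis=1) <= radius][:n]

for name, pts in [("line", line), ("plane", plane), ("sphere", sphere)]:
    print(name, np.round(moment_eigenvalues(pts), 2))
```

A line leaves one dominant eigenvalue, a plane two comparable ones, and a sphere three comparable ones; moment-based shape statistics are built from combinations of these eigenvalues so that each idealized case maps to a value near 0 or 1.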

CAPTIONS

Figure 1. Link sequence generation computational flowchart. R is taken in units of the mean intergalactic spacing d. For
galaxies in redshift space, d is a function of the Hubble distance r = v/H_0, where v is the radial velocity of the galaxy.
From the initial galaxy, sequences are propagated in both (opposing) directions along the major axis until termination; if the
combined number of links is 4 or more, the entire (combined) sequence "qualifies" for computation; else it is discarded.

Figure 2. Filament statistics (planarity θ_P, curvature θ_C, and torsion θ_T) for the halo catalogs versus R/d, with L = d. Error
bars shown are 3σ resampling errors. The statistics show the CHDM models having more structure than the CDM models, by
well over 4σ at most R. Cosmic variance estimated by the difference between CHDM1 and CHDM2 is generally comparable to
resampling error. The Poisson catalog is well discriminated from any model for R/d > 1.3. Note: Values for different models
are slightly offset in R to improve visibility.

Figure 3. Results for selected Luo & Vishniac (1995) statistics and the Robinson & Albrecht (1996) statistic applied to the halo
catalogs after halo breakup. Error bars shown are 3σ resampling errors. The cosmological models are well discriminated from
the Poisson model for R/d > 1.6. The statistics are sensitive to the cosmological model as well as to the normalization (i.e.
the bias factor), with CDM1 and CDM1.5 showing markedly different trends with R. Cosmic variance seems to be significant
for all these statistics, indicating that resampling errors may not be an appropriate measure of total variance.

Figure 4. Results from the Babul & Starkman (1992) statistics applied to the halo catalogs after halo breakup. Error bars
shown are 3σ resampling errors. These statistics generally show a very similar behavior versus each other and versus the
Poisson catalog as the LV statistics. The exception is BS_prol, which shows very little cosmic variance for R/d > 1.8.

Figure 5. (a) Signal strengths S_res(CHDM1,CDM1), as defined in the text, for θ_P, θ_C, and θ_T applied to the halo catalogs.
All statistics discriminate fairly well, with planarity showing the most discrimination. (b) Signal strengths S_res(CHDM1,CDM1)
for the shape statistics applied to the halo catalogs. All statistics show good discriminatory power, with the best ones exceeding
~8σ.

Figure 6. Combined signal strengths S_res+ID(CHDM1,CDM1) as defined in the text; compare to Figure 5 to see effect
of breakup. (a) S_res+ID(CHDM1,CDM1) for the filament statistics applied to the halo catalogs. Comparison with
Figure 5(a) shows that breakup causes the most degradation for planarity, some for curvature, and none for torsion. (b)
S_res+ID(CHDM1,CDM1) for the shape statistics applied to the halo catalogs. All statistics are quite robust with respect
to halo identification uncertainty, with the exception of RA.

Figure 7. Filament statistics θ_P, θ_C, θ_T applied to mock-observed (a) δρ/ρ > 80 and (b) δρ/ρ > 120 halo catalogs with
velocities scaled from velocity factor F_V = 0 (real space) to F_V = 5 times their actual value, with R/d = 1.5. Error bars
shown are 1σ resampling errors. Going from real space (F_V = 0) to ordinary redshift space (F_V = 1) decreases the amount of
structure detected, but actually increases discrimination between models. Note: Values for different models are slightly offset
in F_V to improve visibility.

Figure 8. Redshift distortion test applied to catalogs cut at δρ/ρ > 80 and δρ/ρ > 120, then redshifted by their line-of-sight peculiar
velocity multiplied by the velocity scaling factor F_V. Error bars shown are 1σ resampling errors. While redshift distortion tends
to lower the amount of structure detected, the discrimination between models is generally unchanged.

Figure 9. Filament statistics for the sky catalogs, versus R in units of d(r), the mean interparticle spacing. Error bars shown
are 1σ sky variance errors. Errors are larger than in the halo catalog statistics due to sparseness, and Poisson is not as well
discriminated from models. CDM shows significantly less planarity, curvature, and torsion than CfA1, while CHDM shows
slightly too much. CfA1 does not match with any single catalog over all R, but does follow CHDM2 better than the other
models, especially for 1.2 < R/d < 2.0. The signal-to-noise ratio between CfA1 and Poisson is highest at R/d = 1.3 (note
the small Poisson error bar) for all statistics, indicating optimal sensitivity at this R. Note: Values for different models are
slightly offset in R to improve visibility.

Figure 10. Results from selected Luo & Vishniac (1995) statistics and the Robinson & Albrecht (1996) statistic applied to
the sky catalogs after halo breakup. Error bars shown are 1σ sky variance errors. The large errors due to the sparseness of
the catalogs yield a low discrimination between models. Also, the Poisson catalog shows significant structure aliasing at all
values of R/d. Versus the CfA1 data, no model is ruled out at more than a 2σ level, not including cosmic variance.

Figure 11. Results from the Babul & Starkman (1992) statistics applied to the sky catalogs after halo breakup. Error bars
shown are 1σ sky variance errors. The prolateness statistic, which is the most discriminatory of the three, also shows low cosmic
variance by the crude estimate of comparing CHDM1 to CHDM2. The others do not show good discrimination, and cosmic
variance appears to be larger, although still comparable to sky variance errors.



© 1996 RAS, MNRAS 000, 000-000 




Figure 12. (a) Combined signal strength S_sv+id(CDM1,CHDM1) for filament statistics applied to the CfA1-like sky catalogs.
Torsion clearly shows the highest discrimination, while curvature also shows some discrimination. Cosmic variance, specifically
excluded in this comparison, is significant at all R.

(b) Combined signal strength S_sv+id(CDM1,CHDM1) for shape statistics applied to the sky catalogs. No statistic shows good
discriminatory power, with the best one, BS_prol, barely reaching 2σ.

Figure 1.
Filament Statistics for Halo Catalogs
Figure 2. Halo Catalogs Filament Stats
Selected LV and RA Shape Statistics Applied to Halo
Figure 3. Halo Catalogs LV & RA Stats
Selected BS Shape Statistics Applied to Halo Catalogs
Figure 4. Halo Catalogs BS Stats
Signal Strength vs. R: Halo Catalogs
[Plot: signal strength (vertical axis, ticks up to 14) versus R/d (horizontal axis, 1.2 to 2.4). Panel (a) Filament Stats: θ_P Planarity, θ_C Curvature, θ_T Torsion. Panel (b) Shape Stats: LV_Q Filamentarity, LV_P Planarity, LV_L Linearity, BS_P Prolateness, BS_O Oblateness, BS_S Sphericity, RA Flatness.]
Figure 5. Signal strength for statistics applied to halo catalogs after breakup
Combined Signal Strength vs. R: Halo Catalogs
Figure 6. Combined signal strength for the halo catalogs
Redshift Distortion Sensitivity

(a) δρ/ρ > 80 Halo Catalogs (b) δρ/ρ > 120 Halo Catalogs
Figure 7. Redshift distortion test - Filament statistics
Redshift Distortion Sensitivity: Shape Statistics

(a) δρ/ρ > 80 Halo Catalogs (b) δρ/ρ > 120 Halo Catalogs
Figure 8. Redshift distortion test - Shape statistics
Figure 9. Sky Catalog Filament Stats
Selected LV and RA Shape Statistics Applied to Sky
Figure 10. Sky Catalog LV & RA Stats
Selected BS Shape Statistics Applied to Sky Catalogs
Figure 11. Sky Catalog BS Stats
Combined Signal Strength vs. R: Sky Catalogs
Figure 12. Combined signal strength for the sky catalogs